Avoiding Over-Fitting in ILP-Based Process Discovery

Authors

  • Sebastiaan J. van Zelst
  • Boudewijn F. van Dongen
  • Wil M. P. van der Aalst
Abstract

The aim of process discovery is to discover a process model based on business process execution data, recorded in an event log. One of several existing process discovery techniques is the ILP-based process discovery algorithm. The algorithm is able to unravel complex process structures and provides formal guarantees w.r.t. the model discovered, e.g., the algorithm guarantees that a discovered model describes all behavior present in the event log. Unfortunately, the algorithm is unable to cope with exceptional behavior present in event logs. As a result, the application of ILP-based process discovery techniques in everyday process discovery practice is limited. This paper addresses this problem by proposing a filtering technique tailored towards ILP-based process discovery. The technique helps to produce process models that are less over-fitting w.r.t. the event log, more understandable, and more adequate in capturing the dominant behavior present in the event log. The technique is implemented in the ProM framework.

Similar resources

ILP-Based Process Discovery Using Hybrid Regions

The language-based theory of regions, stemming from the area of Petri net synthesis, forms a fundamental basis for Integer Linear Programming (ILP)-based process discovery. Based on example behavior in an event log, a process model is derived that aims to describe the observed behavior. Building on top of the existing ILP-formulation, we present a new ILP-based process discovery formulation tha...

Filter Techniques for Region-Based Process Discovery

The goal of process discovery is to learn a process model based on example behavior recorded in an event log. Region-based process discovery techniques are able to uncover complex process structures (e.g., milestones) and, at the same time, provide formal guarantees w.r.t. the model discovered. For example, it is possible to ensure that the discovered model is able to replay the event log and t...

An Experimental Evaluation of Passage-Based Process Discovery

In the area of process mining, the ILP Miner is known for the fact that it always returns a Petri net that perfectly fits a given event log. However, the downside of the ILP Miner is that its complexity is exponential in the number of event classes in that event log. As a result, the ILP Miner may take a very long time to return a Petri net. Partitioning the traces in the event log over multipl...

A New ILP Model for Identical Parallel-Machine Scheduling with Family Setup Times Minimizing the Total Weighted Flow Time by a Genetic Algorithm

This paper presents a novel, integer-linear programming (ILP) model for an identical parallel-machine scheduling problem with family setup times that minimizes the total weighted flow time (TWFT). Some researchers have addressed parallel-machine scheduling problems in the literature over the last three decades. However, the existing studies have been limited to the research of independent jobs,...

Decomposed Process Mining: The ILP Case

Over the last decade process mining techniques have matured and more and more organizations started to use process mining to analyze their operational processes. The current hype around “big data” illustrates the desire to analyze ever-growing data sets. Process mining starts from event logs—multisets of traces (sequences of events)—and for the widespread application of process mining it is vit...

Publication date: 2015